Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization
Abstract
Direct-path relative transfer function (DP-RTF) refers to the ratio between the direct-path acoustic transfer functions of two microphone channels. Though DP-RTF fully encodes the sound spatial cues and serves as a reliable localization feature, it is often erroneously estimated in the presence of noise and reverberation. This paper proposes to learn DP-RTF with deep neural networks for robust binaural sound source localization. A DP-RTF learning network is designed to regress the binaural sensor signals to a real-valued representation of DP-RTF. It consists of a branched convolutional module to separately extract the inter-channel magnitude and phase patterns, and a convolutional recurrent module for joint feature learning. To better explore the speech spectra to aid the DP-RTF estimation, a monaural speech enhancement network is used to recover the direct-path spectrograms from the noisy ones. The enhanced spectrograms are stacked onto the noisy spectrograms to act as the input of the DP-RTF learning network. We train one unique DP-RTF learning network using many different binaural arrays to enable the generalization of DP-RTF learning across arrays. This avoids time-consuming training data collection and network retraining for a new array, which is very useful in practical applications. Experimental results on both simulated and real-world data show the effectiveness of the proposed method for direction of arrival (DOA) estimation in noisy and reverberant environments, and a good generalization ability to unseen binaural arrays.
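To make the definition in the first sentence of the abstract concrete, the following is a minimal sketch of a DP-RTF as the ratio of two direct-path transfer functions. It is not the authors' method: it assumes an idealized free-field model in which each channel's direct path is a pure gain and delay, and the function names, parameter names, and the choice of stacking real and imaginary parts as the "real-valued representation" are all illustrative assumptions.

```python
import cmath
import math


def direct_path_rtf(freqs_hz, gain_ref, gain_other, delay_ref_s, delay_other_s):
    """Ratio of two direct-path transfer functions, one value per frequency.

    Assumption: each direct path is modelled as gain * exp(-j*2*pi*f*delay),
    i.e. a free-field propagation model with no head or room effects.
    """
    rtf = []
    for f in freqs_hz:
        h_ref = gain_ref * cmath.exp(-2j * math.pi * f * delay_ref_s)
        h_other = gain_other * cmath.exp(-2j * math.pi * f * delay_other_s)
        rtf.append(h_other / h_ref)
    return rtf


def real_valued_representation(rtf):
    """Hypothetical real-valued encoding: real parts followed by imaginary parts."""
    return [z.real for z in rtf] + [z.imag for z in rtf]


# Illustrative values only: the second channel is 0.8x as loud and 0.25 ms later.
freqs = [250.0, 500.0, 1000.0, 2000.0]
rtf = direct_path_rtf(freqs, gain_ref=1.0, gain_other=0.8,
                      delay_ref_s=0.0, delay_other_s=2.5e-4)
features = real_valued_representation(rtf)
```

Under this simplified model the magnitude of the ratio carries the inter-channel level difference and its phase carries the inter-channel time difference, which is why the ratio can serve as a localization feature.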
Similar resources
Phased Microphone Array for Sound Source Localization with Deep Learning
For phased microphone arrays used in sound source localization, an algorithm with both high computational efficiency and high precision is a persistent pursuit. In this paper, a convolutional neural network (CNN), a kind of deep learning, is preliminarily applied as a new algorithm. At high frequencies, the CNN can reconstruct the sound localizations with excellent spatial resolution, as good as DAMAS, within a very...
Discriminative Binaural Sound Localization
Time difference of arrival (TDOA) is commonly used to estimate the azimuth of a source in a microphone array. The most common methods to estimate TDOA are based on finding extrema in generalized cross-correlation waveforms. In this paper we apply microphone array techniques to a manikin head. By considering the entire cross-correlation waveform we achieve azimuth prediction accuracy that exceeds...
Acoustic Space Learning for Sound-Source Separation and Localization on Binaural Manifolds
In this paper, we address the problems of modeling the acoustic space generated by a full-spectrum sound source and using the learned model for the localization and separation of multiple sources that simultaneously emit sparse-spectrum sounds. We lay theoretical and methodological grounds in order to introduce the binaural manifold paradigm. We perform an in-depth study of the latent low-dimen...
The bat head-related transfer function reveals binaural cues for sound localization in azimuth and elevation.
Directional properties of the sound transformation at the ear of four intact echolocating bats, Eptesicus fuscus, were investigated via measurements of the head-related transfer function (HRTF). Contributions of external ear structures to directional features of the transfer functions were examined by remeasuring the HRTF in the absence of the pinna and tragus. The investigation mainly focused ...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2021
ISSN: 2329-9304, 2329-9290
DOI: https://doi.org/10.1109/taslp.2021.3120641